Time-slice Prediction of Dyadic Human Activities
Authors: Ziaeefard, Bergevin, Morency
Abstract
Recognizing human activities from video data is increasingly used in surveillance and human-computer interaction applications. In this paper, we introduce the problem of time-slice activity recognition, which aims to explore human activity at a finer temporal granularity: time-slice recognition infers human behaviors from a short temporal window. Temporal slice analysis has been shown to be helpful for motion characterization and, more generally, for video content representation. These studies motivate us to consider time-slices for activity recognition. Figure 1 presents an overview of our approach based on time-slice action prediction and contrasts it with conventional approaches, which recognize actions based on either the whole video sequence (referred to as the “holistic” approach) or only its first part (early recognition). Our time-slice approach studies not only the beginning of the action sequence but generalizes to any short-term observation anywhere in the video sequence. Another key novelty is the explicit modeling of the uncertainty that arises when predicting actions from time-slices.

TAP Dataset: We introduce a new dataset, named the Time-slice Action Prediction (TAP) dataset, to evaluate our proposed feature descriptors and enable future research on this topic. The dataset was created by extracting time-slices from existing public human action datasets (UT-Interaction, HMDB, TV Interaction, and Hollywood) and performing a perception study in which multiple annotators gave continuous ratings for each action. The continuous ratings make it possible to represent the uncertainty in time-slice action prediction. Three annotators rated each time-slice on how likely it is that a specific action is occurring. For each time-slice and each action, the annotator was asked to pick one of 5 likelihoods, ranging from “Definitely Not Occurring” to “Definitely Occurring”. Figure 3 illustrates how the annotators rated two example videos.
Methodology:

Stage 1 (Discriminative segments): When analyzing an interaction, we can recognize the ongoing activity with certainty from specific time-slices, such as the slice in which “two people are shaking each other’s hands” in a handshaking activity. To extract discriminative segments from our dataset, we used Fleiss’ kappa coefficient k [2] to measure the reliability of agreement between annotators. For each interaction video, the time-slices on which the annotators are in complete agreement (k=1) that the interaction of interest is definitely occurring are selected as discriminative segments.

Stage 2 (Predict-STIP): Existing STIP detectors are vulnerable to model the inherent uncertainty in partially observed action recognition

Figure 2: Human annotation. This figure shows the average rate given by the 3 annotators for two video examples: hug and push. The label provided by each annotator is converted to a number on a linear scale from 0 to 1, and the mean over annotators is called the average rate. This average rate is used to evaluate the performance of our method. The time-slices between the dashed lines form the discriminative segment of the interaction.
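The annotation steps described above (mapping the five likelihood levels onto a 0 to 1 scale, averaging over annotators, and keeping only the slices with complete agreement) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the five levels are coded as integers 0 to 4, with 0 standing for “Definitely Not Occurring” and 4 for “Definitely Occurring”; the text does not name the intermediate levels. All function names are hypothetical. How kappa was applied to individual time-slices is not specified, so the sketch uses the per-item agreement term of Fleiss’ kappa, which equals 1 exactly when all annotators choose the same level.

```python
from collections import Counter

NUM_LEVELS = 5  # assumed coding: 0 = "Definitely Not Occurring", 4 = "Definitely Occurring"


def average_rate(levels):
    """Map each annotator's level (0..4) linearly onto [0, 1], then average."""
    return sum(level / (NUM_LEVELS - 1) for level in levels) / len(levels)


def per_slice_agreement(levels):
    """Per-item agreement P_i from Fleiss' kappa: share of annotator pairs that agree."""
    n = len(levels)
    counts = Counter(levels)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def fleiss_kappa(ratings):
    """Fleiss' kappa over a list of slices, each a list of annotator levels."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # Mean observed agreement across slices.
    p_bar = sum(per_slice_agreement(r) for r in ratings) / n_items
    # Expected agreement by chance, from the overall share of each level.
    totals = Counter(level for r in ratings for level in r)
    p_e = sum((c / (n_items * n_raters)) ** 2 for c in totals.values())
    if p_e == 1.0:
        # Every rating is identical, so kappa is formally undefined (0/0);
        # treating it as perfect agreement is a convention chosen for this sketch.
        return 1.0
    return (p_bar - p_e) / (1 - p_e)


def discriminative_slices(ratings):
    """Indices of slices where every annotator chose the top level."""
    top = NUM_LEVELS - 1
    return [i for i, r in enumerate(ratings) if all(level == top for level in r)]


# Made-up ratings for five time-slices by three annotators (illustration only).
ratings = [[0, 0, 1], [2, 3, 3], [4, 4, 4], [4, 4, 4], [3, 4, 4]]
print([round(average_rate(r), 2) for r in ratings])
print(discriminative_slices(ratings))
print(fleiss_kappa(ratings))
```

The selection rule checks for unanimous top-level ratings directly instead of thresholding kappa, because complete agreement on “Definitely Occurring” is the condition stated in the text.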
ZIAEEFARD, BERGEVIN, MORENCY: TIME-SLICE PREDICTION OF DYADIC ACTIVITIES